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Abstract 

Most diagnostic laboratories are confronted with the increasing demand for molecular diagnosis from patients and families 
and the ever-increasing genetic heterogeneity of visual disorders. Concerning Retinal Dystrophies (RD), almost 200 
causative genes have been reported to date, and most families carry private mutations. We aimed to approach RD genetic 
diagnosis using all the available genetic information to prioritize candidates for mutational screening, and then restrict the 
number of cases to be analyzed by massive sequencing. We constructed and optimized a comprehensive cosegregation RD- 
chip based on SNP genotyping and haplotype analysis. The RD-chip allows to genotype 768 selected SNPs (closely linked to 
100 RD causative genes) in a single cost-, time-effective step. Full diagnosis was attained in 17/36 Spanish pedigrees, 
yielding 12 new and 12 previously reported mutations in 9 RD genes. The most frequently mutated genes were USH2A and 
CRB1. Notably, RD3-up to now only associated to Leber Congenital Amaurosis- was identified as causative of Retinitis 
Pigmentosa. The main assets of the RD-chip are: i) the robustness of the genetic information that underscores the most 
probable candidates, ii) the invaluable clues in cases of shared haplotypes, which are indicative of a common founder effect, 
and iii) the detection of extended haplotypes over closely mapping genes, which substantiates cosegregation, although the 
assumptions in which the genetic analysis is based could exceptionally lead astray. The combination of the genetic 
approach with whole exome sequencing (WES) greatly increases the diagnosis efficiency, and revealed novel mutations in 
USH2A and GUCY2D. Overall, the RD-chip diagnosis efficiency ranges from 16% in dominant, to 80% in consanguineous 
recessive pedigrees, with an average of 47%, well within the upper range of massive sequencing approaches, highlighting 
the validity of this time- and cost-effective approach whilst high-throughput methodologies become amenable for routine 
diagnosis in medium sized labs. 
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Introduction 

Retinal dystrophies (RD) are a group of more than 25 genetic 
visual disorders [1]. Although RDs rank among mendelian rare 
diseases, taken together, they occur at an estimated prevalence of 
1-2 patients per 1000 individuals. In fact, the most frequent form 
of RD, retinitis pigmentosa (RP), affects 1.5 million individuals 
worldwide [2]. The clinical traits underlying these disorders 
disturb from the macular region (central vision) to the outlying 
retinal area (peripheral vision). In addition, at least 30 different 
syndromes (such as Usher and Bardet-Biedl) share some of these 
phenotypic alterations [3,4]. On the genetic side, more than 5000 
mutations in almost 200 genes are causative of retinal dystrophies 
so far [1,5,6]. Yet, around 35% of the cases remain unassigned [7]. 
The extreme heterogeneity of RDs at the clinical and genetic levels 
hinders the accurate clinical assessment, patient management, and 
genetic counseling. Within this context, molecular diagnosis, 



however challenging, is instrumental to improve the diagnosis 
and prognosis of RDs and guide future therapies [7-9] . 

Currently, the most demanding issue in RD molecular diagnosis 
is the prioritization of methodological strategies, where the main 
parameters to be balanced are cost, time and yield. These 
parameters strongly depend on phenotypic clinical assessment, 
pedigree information, sample availability and methodological 
resources. Most genetic laboratories resort to direct mutational 
screening when the clinical traits and/ or the genetic information 
associated to the disease limit the number of candidates to be 
analyzed. Unfortunately, this is not a common case for RDs, and 
this type of analysis would imply screening more than 1 500 exons. 
The search for an alternative cost-effective approach is mainly 
being performed using high-throughput platforms, in particular 
massive sequencing, which require powerful and sophisticated 
bioinformatics tools for analyzing and filtering the data [10,11], 
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To improve diagnosis, we have focused on a comprehensive 
strategy based on the clinical phenotype and all available genetic 
data prior to either analyze a reduced manageable number of 
candidate genes or resort to massive sequencing. We have 
generated and optimized a SNP-based chip for haplotype 
cosegregation analysis [12,13] to genotype 7-10 SNP markers of 
one hundred genes associated to the most prevalent RDs: Cone 
Dystrophies (CD), Cone-Rod Dystrophies (CRD), Congenital 
Stationary Night Blindness (CSNB), Leber Congenital Amaurosis 
(LCA), Macular Degeneration (MD) and RP. Based on this 
methodology, a multi-tiered approach has been devised to cost- 
effectively diagnose [14] a panel of 36 Spanish families with non- 
syndromic retinal dystrophies plus 5 patients with Usher's 
syndrome. As a result, we have identified the pathogenic mutation 
of 1 7 out of the 36 families, and 3 of the 5 isolated Usher patients, 
overall reporting 14 novel mutations. After the RD-chip analysis 
discarded all known RD genes, Whole Exome Sequencing (WES) 
was undertaken in two pedigrees. The pathogenic mutations were 
unexpectedly identified in two RD candidates, which had been 
previously discarded as non-cosegregating on the basis of 
homozygosity by descent in consanguineous families, and infre- 
quent recombination of closely mapping SNPs. 

Methods 

Patients 

Thirty-six Spanish families diagnosed with RP, LCA, CRD or 
CD plus 5 isolated Usher's syndrome patients were recruited for 
this study. Written informed consent from the patients and 
relatives was obtained following the tenets of the Declaration of 
Helsinki. Patient recruitment and sample collection procedures 
had been previously approved by the Bioethics Committee of the 
University of Barcelona (Barcelona, Spain). Peripheral blood DNA 
was obtained using the MoleStrips DNA Blood kit with the 
GeneMole instrument (Mole Genetics, Mole, Lysaker, Norway). 
DNA from Spanish control individuals was obtained from 
peripheral blood using the same methodology. 

SNP Selection 

The RD chip for the molecular diagnosis of Mendelian non- 
syndromic retinal dystrophies was an optimized version of a 
previous cosegregation chip for RP-LCA disorders [12,13]. Seven 
to ten SNPs were selected for each candidate (100 genes in total), 
and genotyped on a customized Golden Gate Genotyping Assay 
(Illumina). The SNPs were selected following: i) high informativity 
according to SNPbrowser Software Version 4.0.1 and dbSNP 
database (www.nlm.nih.gov/projects/SNP/); ii) physical location 
(covering upstream, intragenic and downstream regions); iii) 
inclusion in different haplotypic blocks. The genes analyzed by 
this RD chip are listed by chromosome position in Figure 1. In 
addition, some common mutations in ABCA4, CERKL, COLSA2, 
CRB1, LRP5, NR2E3, PRPF31, RHO and USH2A, were included 
for direct genotyping. 

High-throughput SNP Genotyping 

One microgram of sample DNA (at 20 ng/ ui) was laid in 96- 
well plates. SNPs were genotyped using the Golden Gate Array 
(Illumina) platform following the instructions, protocol and 
software provided by the manufacturers. Haplotype and cose- 
gregation analyzes were performed by hand from the raw data 
genotypes. 



Mutational Screening 

In Usher syndrome samples, where a major causative gene is 
involved, the Arrayed Primer Extension (APEX) approach plus 
direct gene sequencing was the molecular diagnosis procedure of 
selection. In all other cases, cosegregation analysis with the RD- 
chip allowed to highlight the best candidates for mutational 
screening. All the exons and exon-intron boundaries of selected 
genes were directly screened for mutations in each patient. 
Genomic DNA was amplified, purified on High Pure 96 UF 
Cleaning Plates (Roche) and sequenced using the BigDye v3.1 kit 
(Applied Biosystems, Inc.) in the ABI PRISM 3730 DNA 
sequencer (Applied Biosystems, Inc.). 

All missense changes identified were verified in control 
population using the dbSNP database (Build 137, www.ncbi.nlm. 
nih.gov/projects/SNP/), the 1000 Genomes Project data (http:// 
browser.1000genomes.org/index.html), and ESP6500 data of the 
National Heart, Lung, and Blood Institute GO Exome Sequencing 
Project (http://evs.gs.washington.edu/EVS). To validate unre- 
ported missense genetic variants, over one hundred matched 
controls were analyzed to discard rare non-pathogenic polymor- 
phisms restricted to the Spanish population. 

Bioinformatic Analyses 

All the sequences were analyzed using the sequence assembly 
software Seqman (DNAStar, Madison, WI) and aligned to the 
reference gene sequence (Genome Reference Consortium human 
genome build 37, human genome 19). 

The pathogenicity of all new missense changes identified in 
patients was evaluated using the in silico predictors SIFT (http:// 
sift.jcvi.org/[15]) and PolyPhen-2 (http://genetics.bwh.harvard. 
edu/pph2/[16]). When the putative mutations affected splice sites, 
the splicing site score values of the wild-type and variant sequences 
were predicted online with NetGene2 (http://www.cbs.dtu.dk/ 
services/NetGene2/[17,18]), Human Splicing Finder [19], and 
MaxEntScan (http://www.umd.be/HSF/pO]) prediction servers. 

Whole-Exome Sequencing 

Genomic DNA from peripheral blood was prepared using the 
QIAamp DNA Blood Maxi Kit (Qiagen). Exome capture was 
performed at the CNAG using SureSelectXT All Exome v4 kit 
(Agilent), following the manufacturer's protocol. Libraries were 
sequenced on a HiSeq2000 (Illumina) to at least 80x average 
coverage of the target region. 

Reads were aligned to the human reference genome build 
GRCh37 (hgl9) using the Burrows- Wheeler Aligner (BWA) [21] 
in family E5 and GEMMapper [22] in family 9RE. Mapped reads 
were filtered (leaving only those mapping in unique genomic 
positions with enough quality), sorted and indexed with SAMtools 
(version 0.1.18) [23]. Identification of single nucleotide variants 
and Indels was performed using GATK standard hard filtering 
parameters [24] (family E5) or SnpEff [25] (family 9RE). In house 
Perl scripts were used to select the variants shared by all affected 
individuals, predicted to produce a high or moderate impact, 
including intron-exon junctions, non-annotated variants (synony- 
mous, non-synonymous, and non-sense mutations) in coding 
regions, or short coding insertions or deletions. Variants mapping 
to the candidate genes were selected for further validation. For the 
final WES report the VARIANT [26] annotation tool provided 
the putative functional consequence, as well as other additional 
relevant information of the identified variants for the final 
candidate gene selection. 
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Figure 1 . Genes included in the RD-chip. The RD-chip contains the most relevant RD genes at the time of the array design (1 00 genes). Genes are 
listed by chromosome and physical location. The interval between the first and last genotyped SNP is shown as "Covered region" (the chromosome 
position is based on human reference sequence NCBI 36/hg18). Colored boxes indicate association with particular retinal dystrophies. CD/CRD: Cone 
or Cone-Rod Dystrophy; CSNB: Congenital Stationary Night Blindness; LCA: Leber Congenital Amaurosis; MD: Macular Degeneration; RP: Retinitis 
Pigmentosa; OR: Other Retinopathies. 
doi:1 0.1 371 /journal.pone.008841 0.gOOl 
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RT-PCR Expression Analysis 

Blood samples from patients, relatives and unrelated controls 
were mixed with an RNA stabilizer solution (RNALater; Ambion) 
in a 1:4 ratio. Total RNA was obtained from 3 ml of blood using 
the RiboPure-Blood Kit (Ambion, Austin, TX), and retrotran- 
scribed using the Transcriptor High Fidelity cDNA Synthesis Kit 
(Roche Applied Science, Indianapolis, IN) with a mixture of 
random hexamers and oligo(dT)18, according to the manufactur- 
er's instructions. RPGRIP1 and G3PDH (used as control) 
transcripts were amplified using specific exon primers and the 
GoTaq Flexi DNA polymerase (Promega, Fitchburg, WI) in a final 
volume of 50 ju.1. The G3PDH PGR conditions were: denaturation 
for 5 min at 94°C followed by 35 cycles of 20 s at 94°C, 30 s at 
60°G, and 1 min at 72°C, using 2 ul of cDNA. For RPGRIP1, 
primers were located in exons 14 and 16, and the PCR conditions 
were: denaturation for 5 min at 94°C followed by 38 cycles of 20 s 
at 94°C, 30 s at 60°C and 40 s at 72°C, using 5 ul of cDNA. 
Amplified bands were excised, purified from the gel using the 
Expin GeneAll Gel SV kit (GeneAll) following the manufacturer's 
protocol, and sequenced. 

Plasmid Constructions and Expression Assay 

The reconstructed RPGRIP1 minigene encompassed exons 1 2 to 
18, plus at least 200 bp of each intron-exon boundary, after 
amplification of genomic DNA from patient 1 of the 59RE 
pedigree (heterozygote for the c.2367+23del mutation). The 
minigene was cloned in-frame at the C-terminus of the HA 
epitope into the pcDNA3.1 expression vector. Wild-type and 
mutated clones were confirmed by sequencing. 

HEK293 cells were seeded on 12-well plates (4xl0 3 cells/well) 
and grown in DMEM (Invitrogen, Barcelona, Spain) supplement- 
ed with 10% of fetal bovine serum. After 12 hours, cells were 
transiently transfected with, either the pcDNA3. 1-wild type (wt) 
RPGRIP1 minigene, the pcDNA3.1-mut RPGRIP1 minigene 
(containing the c.2367+23del mutation), or the empty vector 
(Clontech-BD), using Lipofectamine 2000 (Invitrogen). Forty-eight 
hours after transfection, cells were collected, lysed, and total 
mRNA was used for RT-PCR (same protocol as above). RPGRIP1 
cDNA amplification was performed using primers of exons 1 5 and 
18 as follows: denaturation for 5 min at 94°G, followed by 35 
cycles of 20 s at 94°G, 30 s at 62°C, and 1 min 30 s at 72°C, using 
1 u.1 of cDNA. G3PDH amplification was used as a control. 

Results 

Classification of the Patients According to Phenotype 
and Family Information 

Our initial cohort comprised 41 families, 36 affected with non- 
sydromic RDs and 5 affected by Usher syndrome type II. We 
attained full diagnosis for 22 of them, whose pedigrees are 
displayed in Figure 2. Cosegregation with the novel mutated alleles 
identified is shown in Figure SI. 

Analysis of Usher Syndrome Cases 

Usher syndrome is characterized by specific phenotypic traits 
that allow a clear clinical characterization in three main forms, 
being USH II the most frequent type (between half and two thirds 
of all cases), and USH2A the major causative gene (75-80% of 
USH II cases) [27]. Our cohort had 5 USH II cases as isolated 
patients or small pedigrees, which when directly tested for 
described mutations in USH2A (Asper Biotech), only 4 out of 10 
mutated alleles were identified, all in heterozygosis. As no 
complete molecular diagnosis was achieved, direct mutational 
screening of all USH2A exons was then undertaken, prioritizing 



the analysis of exons where reported mutations cluster. Overall, 
three of the five patients (pedigrees shown in Figure 2) were 
completely diagnosed, with 3 missense mutations identified: 
c.l751G>T in heterozygosis (p.C584F), c.10636G>A in homo- 
zygosis (p.G3546R), and c. 12574C>T in heterozygosis 
(p.R4192C) (Table 1 A). The pathogenicity of the two novel 
variants, p.C584F and p.R4192C, was supported by the PolyPhen 
and SIFT algorythms (Table 2). In the remaining two patients only 
one mutated allele was identified (Table 1 A). 

Genotyping of RD Families 

Contrary to Usher syndrome, most RDs show high clinical and 
genetic heterogeneity, which gready hampers molecular diagnosis. 
Our approach was to use genetic information and cosegregation 
analysis to decrease the number of candidate genes for mutational 
screening. When pedigrees were available, the use of an 
automated and robust SNP-based genotyping microarray gready 
diminished the number of candidates. To this aim, 36 families 
(with at least four available samples) affected with retinal 
dystrophies (Retinitis Pigmentosa, Leber Congenital Amaurosis, 
Cone-rod Dystrophy or Cone Dystrophy) were analyzed with our 
optimized in house RD-chip that genotyped 768 SNP markers 
spanning the 100 most prevalent RD genes reported at that 
moment (Figure 1). Six families showed autosomal dominant 
inheritance and a large number of affected individuals; twenty- 
nine were autosomal recessive pedigrees with a low number of 
affected members, and the remaining two were compatible with 
X-linked inheritance. 

After RD-chip genotyping, haplotypes were constructed for 
each family to assess cosegregation under the presumed inheri- 
tance pattern. In pedigrees where 90 to 99% of candidates were 
discarded, direct mutational screening was performed in the 
remaining non-excluded genes. For each case, the candidates were 
prioritized according to previous gene assignment to: 1) the same 
clinical diagnosis and mendelian pattern; 2) a closely related 
retinal dystrophy with the same inheritance pattern; 3) the same 
clinical phenotype irrespective of the inheritance pattern, and 
finally 4) the remaining RD candidates. 

This approach allowed us to identify the pathogenic mutation in 
17 families out of 36 (47,2%), depicted in Figure 2:2/2 in X-linked 
families, 6/18 of recessive non-consanguineous families (33,3%), 
8/10 recessive consanguineous families (80%) and 1/6 of 
dominant families (16,6%). 

Inferred Haplotypes and Subsequent Mutation Screening 
of the Prioritized Candidates in X-linked and Dominant 
Pedigrees 

The genotyping results for each family were first analyzed under 
the most probable mendelian pattern of inheritance to exclude 
non-cosegregating genes and prioritize the remaining candidates. 
The final results are presented by family and summarized in 
Table 1 B. 

Seven pedigrees were compatible with an autosomal dominant 
(ad) pattern, but one (1 1NCE) could also be explained by an X- 
linked inheritance, as all affected women showed a less severe 
phenotype. In this case, the haplotypes were first analyzed under a 
X-linked hypothesis (see below). Of the six AD families, only 2 
(60RG and 2NCE) were amenable for mutational screening, 
while in the rest more than 10 candidates remained. Indeed, in 
dominant diseases a large number of samples is required to attain 
genetic informativity, which seldom occurs. In family 60RG, the 
RD-chip highlighted seven candidates, three of them responsible 
for adRP: SEMA4A, SNRNP200 and TOPORS. We prioritized the 
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Figure 2. Spanish pedigrees diagnosed in this study. Families were categorized as arUSH II, arRP, adRP, XL-RP, arLCA, arCRD and adCD 
according to mendelian pattern of inheritance and phenotype. Bold numbers indicate samples available for analysis. 
doi:1 0.1 371 /journal.pone.008841 0.g002 
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Table 1. Summary of the diagnosed families in this study. 





A) USH II families 


Family 


Phenotype 


Inheritance 




Nucleotide change 


Protein change 


Zygosity 


Refe rs n c ss 


40ORG 


Usher II 


recessive 


USH2A 


c.1751G>T 


p.C584F 


Het 


[a] 










c.2299delG b 


p.E767Sfs*20 


Het 


[50] 


54RE 


Usher II 


recessive 


USH2A 


C.10636OA 


p.G3546R 


Horn 


[34] 


50 RG 


Usher II 


recessive 


USH2A 


c.9799T>C b 


p.C3267R 


Het 


[33] 










C.125740T 


p.R4192C 


Het 


[a] 


56RE 


Usher II 


recessive 


USH2A 


c.2299delG b 


p.E767Sfs*20 


Het 


[50] 










n.i 


n.i 






94RE 


Usher II 


recessive 


USH2A 


c.2299delG b 


p.E767Sfs*20 


Het 


[50] 










n.i 


n.i 






B) Families analized by the RD-xip with identified mutations (sorted by gene). 


120RG 


RP 


recessive 


CRBI 


C.1702OT 


p.H568Y 


Horn 


[a] 


10RE 


LCA 


recessive 


CRBI 


c.3749+2_3749+3del 


splicing 


Homo/Het 


[a] 










c.2843G>A 


p.C948Y 


Het 


[30] 


23NCE 


RP 


recessive 


CRBI 


C.2290OT 


p.R764C 


Horn 


[30] 


25NCE 


RP 


recessive 


CRBI 


c.2843G>A 


p.C948Y 


Horn 


[30] 


T5 


RP 


recessive 


CR81 


c.2843G>A 


p.C948Y 


Horn 


[30] 


17NCE 


RP 


recessive 


MERTK 


c.2189+1G>T 


splicing 


Horn 


[31] 


22NCE 


RP 


recessive 


PR0M1 


c.1 984-1 G>T 


splicing 


Horn 


[a] 


83RE 


RP 


recessive 


RD3 


c.259A>G 


p.K87E 


Horn 


[a] 


11NCE 


RP 


X-linked 


RP2 


c.409_411del 


p.H37del 


Hemi 


[29] 


20NCE 


RP 


X-linked 


RP2 


All gene deletion 




Hemi 


[a] 


59RE 


LCA 


recessive 


RPGRIPI 


c.895_896del 


p.E299Sfs*21 


Het 


[a] 










c.2367+23del c 


intronic 


Het 


[a] 


60 RG 


RP 


dominant 


SNRNP200 


c.2042G>T 


p.R681L 


Het 


[a] 


18NCE 


RP 


recessive 


USH2A 


c.2276G>T 


p.C759F 


Het 


[32] 










c.9799T>C 


p.C3267R 


Het 


[33] 


21NCE 


RP 


recessive 


USH2A 


c.1434G>C 


p.E478D 


Het 


[51] 










c.2276G>T 


p.C759F 


Het 


[32] 


75RE 


RP 


recessive 


USH2A 


C.2209OT 


p.R737X 


Het 


[52] 










c.8693A>C 


p.Y2898S 


Het 


[a] 


5NCE 


RP 


recessive 


USH2A 


C.652-2A>G 


splicing 


Het 


[a] 










c.2276G>T 


p.C759F 


Het 


[32] 


93RE 


CRD 


recessive 


ABCA4 


c.3988G>T 


p.E1330X 


Het 


[a] 










C.6410OA 


p.C2137Y 


Het 


[a] 


C) Families with identified mutations by WES 


9RE 


CD 


dominant 


GUCY2D 


c.2747T>C 


P.I916T 


Het 


[a] 


E5 


RP 


recessive 


USH2A 


c.2167+5G>A 


splicing 


Het 


[53] 










c.4325T>C 


p.F1442S 


Het 


[35] 










c.7364G>A 


p.W2455X 


Het 


[a] 



For each family, the phenotype, inheritance model, the altered gene, the identified mutation, the homozygosity/heterozygosity state, and the reference of previously 
described mutations are indicated, [a] This study, b Mutations previously identified by APEX c Unknown pathogenicity, n "' Not identified. 
doi:1 0.1 371 /journal.pone.008841 0.tOOl 



analysis of SNKNP200, as the cosegregating haplotype extended to 
three neighbouring RD genes (CNNM4, CNGA3, MERTK), overall 
covering more than 16 Mb (Figure 1). Direct exonic sequence 
revealed a novel missense mutation in SMRNP200, c.2042G>T 
(p.R68 1L) (Table 1 B) in a codon also mutated in other adRP cases 
(C.2041OT, p.R681C and c.2042G>A, p.R681H) [28]. Bioin- 
formatics analysis showed that this residue was highly evolutionary 



conserved (data not shown) and predicted a damaging effect 
(Table 2). Moreover, none of the healthy siblings did carry the 
pathogenic variant, supporting its pathogenicity. In family 2NCE, 
after the RD chip analysis, ten candidates remained. Extended 
haplotypes with neighbouring RD genes decreased the number of 
candidates to five. Unfortunately, no mutation was found in any 
candidate (the pedigree is not included in Figure 2). 
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Table 2. Pathogenicity predictions for new missense 
mutations. 



SIFT PolyPhen-2 



Gene 


Mutation 


Score 


Prediction Score 


Prediction 


CRB1 


p.H568Y 


1 


Tolerated 


0,999 


Probably damaging 


GUCY2D 


P.I916T 


0,002 


Damaging 


1 


Probably damaging 


RD3 


p.K87E 


0,01 


Damaging 


0,997 


Probably damaging 


SNRNP200 


p.R681L 


0 


Damaging 


1 


Probably damaging 


USH2A 


p.C584F 


0 


Damaging 


1 


Probably damaging 


USH2A 


p.F1442S 


0 


Damaging 


1 


Probably damaging 


USH2A 


p.Y2898S 


0 


Damaging 


0,998 


Probably damaging 


USH2A 


p.R4192C 


0 


Damaging 


0,998 


Probably damaging 


ABCA4 


p.C2137Y 


0 


Damaging 


1 


Probably damaging 



Dash(— ) means no splice site predicted. 
doi:1 0.1 371 /journal.pone.008841 0.t002 



X-linked inheritance was assumed for 20NCE and 1 1NCE, the 
latter being also compatible with AD inheritance. SNP genotyping 
revealed a common deleted region comprising the full RP2 locus in 
all male patients of 20NCE, clearly underscoring RP2 as the 
disease-causing gene. Indeed, mutational screening confirmed the 
deletion of the whole coding region. Concerning pedigree 1 1NCE, 
the milder affectation of women -suggestive of an X-linked trait-, 
added to the cosegregation of the RP2 haplotype, both pinpointed 
this candidate for mutational screening. Exon sequencing identi- 
fied a previously reported mutation, c.409_41 ldel causing 
p.I137del, in all the family patients (Table 1 B) [29]. 

Recessive Consanguineous Families 

In five often known consanguineous families (10RE, 120RG, 
23NCE, T5, and 25NCE), CRB1 was the candidate of choice. 
Prioritization was established based on either an extended 
haplotype comprising the adjacent CFH locus in four pedigrees, 
or a shared haplotype with a previously diagnosed family, 
suggesting a founder effect. 

The patient II. 6 in the family 10RE (Figure 2) carried an 
homozygous unreported deletion in the intron 9 splice donor site 
of CRB1, c.3749+2_3749+3del, which ablated the splice signal. 
This outcome was confirmed by in silica predictions (Table 3). His 
nephew was a double heterozygote for this mutation plus a 
frequent pathogenic variant, c.2843G>A p.C948Y [30]. 

In family 120RG, all but four RD genes were discarded, of 
those, CRB1 showed an extended haplotype. Direct sequencing 
revealed a novel missense mutation in homozygosity, C.1702OT 
p.H568Y, whose pathogenicity was supported by in silico analyses 
(Table 2). On the other hand, families 23NCE and T5 showed 
homozygosity for the known missense mutations C.2290OT 
p.R764C, and c.2843G>A p.C948Y, respectively [30]. Of note, 
the haplotype of family T5 affected members was also shared by 
the patients of another family (25NCE). Subsequent sequencing 
analysis of CRB1 confirmed the same causative mutation, 
supporting common ancestry. 

Family 83RE showed an extended haplotype spanning USH2A 
and RD3. Given that the clinical diagnosis of the family was RP, 
USH2A was prioritized for direct mutational screening but was 
excluded after sequencing 72 exons. Screening of the RD3 
candidate, previously reported only as a LCA-causative gene, 
revealed a new missense mutation c.259A>G p.K87E in 
homozygosity. In silico predictions supported its pathogenicity 



(Table 2). Notably, this variant had been identified in 4 out of 
13.002 control chromosomes (NHLBI Exome Sequencing Proj- 
ect). 

The consanguineous family 75RE was first analyzed assuming 
homozygosity by descent, and under this assumption, all 
candidates were discarded. However, if non-consanguinity was 
assumed an extended haplotype spanning candidates USH2A and 
RD3 emerged. Subsequent mutational screening of USH2A 
identified two disease-causing mutations in the patients: the 
reported nonsense c.2209C<T p.R737X and the novel missense 
c.8693A>C p.Y2898S. The pathogenicity of the latter was fully 
supported by in silico predictions (Table 2). 

The non-discarded genes of family 17NCE were prioritized 
according to: 1) cosegregation and phenotype, pointing to LRAT, 
or 2) shared haplotype with a previously diagnosed Spanish family 
in MERTK [31]. LRATwas discarded after sequencing of all exons, 
whereas direct screening of MERTK exon 16 identified the 
expected mutation c.2189+lG>T in homozygosis, again support- 
ing a founder effect. 

Two remaining families were not further considered due to the 
lack of genetic informativity. 

Recessive Non-consanguineous Families 

After cosegregation analysis, three candidate genes remained in 
the 22NCE family (PROM1, RP1 and TEAD1). Although 
consanguinity had not been reported, the mutational screening 
otPROMl revealed a novel homozygous mutation, c.1984-1G>T, 
which ablates the consensus acceptor splice site of intron 17. Its 
pathogenicity was fully confirmed by in silico predictions (Table 3). 

In family 59RE, seven candidate genes cosegregated, although 
the clinical phenotype of the patients pointed RPGRIP1 and 
TUIP1 as the best candidates. TUIP1 did not bear any mutation, 
whereas two previously unreported variants in RPGRIP1 were 
identified. The variant c.895_896del, p.E299Sfs*21, was clearly 
pathogenic and produced a truncated protein. The other variant 
was intronic, c.2367+23del, and the possible pathogenic effect was 
unknown. In silico predictions for splice sites, splice enhancers and 
silencers did not reveal any strong molecular alteration. Besides, 
in vivo analysis of patient's mRNAs was restrained by the RPGRIP1 
low expression levels in blood. Finally, the transfection in cultured 
cells of minigenes bearing either the WT or the variant sequence 
did not conclusively support its pathogenicity. As the analysis of 
434 control chromosomes identified this variant once, the C.2367+ 
23del variant could be presumably classified as a rare indel, and its 
pathogenicity remains to be proved. 



Table 3. Pathogenicity predictions for new splicing 
mutations. 











NetGene2 


MaxEnt 


HSF 






(0-1) 


(score) 


(0-100) 


CRB1 


wt 


0,37 


9,6 


96,67 


c.3749+2_3749+3del - 


MERTK 


wt 


0,86 


5,58 


86,8 


c.2189+1G>T - 


PROM1 


wt 


0 


8,36 


84,3 


c.1984-1G>T - 


USH2A 


wt 


0,83 


4,89 


90,92 




c.652-2A>G 




0,56 







doi:1 0.1 371 /journal.pone.008841 0.t003 
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Figure 3. Cosegregation of USH2A mutations identified in 
family E5. Three heterozygous USH2A mutations cosegregate with the 
disease in this consanguineous family. Patient 11-4 carried two mutated 
alleles M2 (c.4325T>C) and M3 (c.7364G>A), whereas his affected son 
(111-2) inherited the M3 paternal allele plus the reported M1 (c.2167+ 
5G>A) mutation from his mother. 
doi:1 0.1 371 /journal.pone.008841 0.g003 

Three families compatible with a recessive RP inheritance 
(5NCE, 18NCE and 21NCE) showed cosegregation with 5-12 
candidates, USH2A among them. Given that direct genotype of 
mutations in the RD-chip had already detected a frequent USH2A 
pathogenic allele c.2276G>T, p.C759F, in heterozygosis [32], 
direct sequencing of the full coding sequence was undertaken. 
Data revealed one novel pathogenic allele in family 5NCE, c.652- 
2A>C, which ablates the acceptor splice site, and two reported 
mutations c.9799T>C p.C3267R (18NCE) and c.l434G>C 
p.E478D (21NCE); although the pathogenicity of this last variant 
is still controversial [32-34]. 

Family 93RE whose clinical diagnosis was compatible with 
either CRD or recessive Stargardt disease showed cosegregation 
with GNAT2, ZNF513, 0PA1, RP1L1 and ABCA4. Based on the 
type of inheritance and phenotype, we prioritized the analysis of 
GMAT2, ZNF513 and ABCA4. Two unreported mutations in 
ABCA4 were identified, the nonsense c.3988G>T p.E1330X and 
the missense c.6410G>A p.C2137Y variants. In silico predictions 



by PolyPhen2 and SIFT of the latter supported its pathogenicity 
(Table 2). 

WES Families 

After exclusion with the chip of the one hundred RD 
candidates, WES was undertaken in suitable remaining families. 
Concerning family E5, most recessive RD genes were discarded 
under the assumption of claimed consanguinity (LI and 1.2), and 
the rest of candidates was excluded by Sanger sequencing. WES 
was then undertaken for patients LI.4 and III.2 (Figure 2). 
Unexpectedly, one novel (c.7364G>A) and one recently reported 
mutation (c.4325T>C) [35] in USH2A were identified in patient 
LT.4, whereas his affected son (patient III.2) carried the c.4325T> 
C mutation from his father plus the reported pathogenic C.2167+ 
5G>A allele inherited from his mother. Haplotype analysis of the 
pedigree confirmed cosegregation of these pathogenic variants 
(Figure 3). In this family, the exceptional non-compliance with the 
homozygosity-by-descent assumption had excluded USH2A as the 
causative gene in the RD chip analysis. 

The 9RE pedigree (Figure 2) showed an autosomal dominant 
cone dystrophy. In this large cohort, the high genetic informativity 
allowed us to exclude most candidates after the RD cosegregation 
chip, and Sanger sequencing ruled out the few remaining genes. 
Exome sequencing was then undertaken and identified a new 
mutation in GUCT2D, c.2747T>C. This candidate had not been 
previously considered because the SNP haplotype analysis showed 
recombination within the 3' flanking region in one affected 
member (IV 5). Although the selected SNPs are closely linked to 
the genes to minimize it, recombination, however infrequent, may 
occur and mislead the analysis. 

RD-Chip Efficiency 

Table 4 summarizes the percentage of complete molecular 
diagnosis attained with the RD chip, categorized by inheritance 
pattern and the number of samples available per family. Six 
autosomal dominant families with at least 4 available samples (and 
a minimum of three patients) were analyzed, yet only one family, 
60RG, (7 samples with three affected members) was successfully 
diagnosed. Indeed, when dealing with dominant diseases, the main 
drawback for successful diagnosis is the large number of samples 
required to attain discriminative genetic informativity. In contrast, 
in recessive families, the percentage of success doubles to 33% (6 
out of 18) in non-consanguineous, and dramatically raises up to 
80% (8 out of 10), in consanguineous pedigrees. We conclude that 
the highest informativity of recessive pedigrees, particularly under 
the homozygosity-by-descent assumption in consanguinity, in- 
creases the efficiency of the RD chip and highlights its reliability 
for RD molecular diagnosis. 



Table 4. Families solved by the RD-chip according to the number of samples available and the inheritance pattern. 







Number of families solved/Total number of families 










4 samples 


5 samples 


6 samples 


7 samples 


8 samples 


>8 samples 


Total 


Dominants 




0/1 


0/1 


1/2 




0/2 


1/6 


Recessives 


4/7* 


2/6 


0/1 


0/2 


0/1 


0/1 


6/18 


Consanguineous 


2/2 


1/3 


1/1 




2/2 


2/2 


8/10 


X-linked 


1/1 












2/2 


Total 


7/10 


3/10 


1/3 


2/5 


2/3 


2/5 


17/36 



*One allel identified by direct genotiping in 3 cases. 
doi:1 0.1 371 /journal.pone.0099354.t004 
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Table 5. Comparison of efficiencies of different methods for RD molecular diagnosis. 







N Genes analyzed 


Yield 


Reference 


APEX 


1-16 


1 5-44% 


[44, 54] 


Long-PCR 


9 


33% 


[43] 


Autozygome 


16-100 


42-52% 


[48, 55] 


Target Capture 


105-179 


36-56% 


[11, 44-47, 56] 


WES 


All 


44-83%* 


[43, 48, 49, 57] 


WGS 


All 


50% 


[10] 


RD-chip 


100 


47% 


This study 



*Higher efficiency percentages are obtained when few families or cases are analysed. 
doi:1 0.1 371 /journal.pone.0099354.t005 



Finally, two X-linked cohorts were successfully diagnosed. The 
RD chip included markers for cosegregation analysis of the two 
known X-linked causative loci (RP2 and RPGR), given that X- 
linked inheritance can only be discarded in pedigrees with male- 
to-male transmission [36], and as a means to avoid the 
burdensome task of sequencing the hot-spot ORF15 of RPGR 
whenever cosegregation analysis does not support it. 

Discussion 

To meet the increasing demand for genetic diagnosis from 
clinicians and patients, we have devised and optimized an RD chip 
that allows us to extract maximum genetic informativity from 
pedigrees, exclude a large number of non-cosegregating candi- 
dates and focus on the most probable causative genes. Moreover, 
the exclusion of all analyzed candidates by the RD chip highlights 
the families suitable for next generation sequencing (NGS) and 
subsequent identification of novel RD genes. 

To improve the efficiency and the scope of the RD-Chip with 
respect previous attempts [12,37] we have: i) increased the number 
of SNPs (7-10) per gene to maximize the genetic informativity, ii) 
extended the cosegregation study to 100 retinal dystrophy genes, 
and iii) included some prevalent pathogenic point mutations in the 
Spanish population for direct genotyping. The RD-chip allows to 
genotype 768 selected SNPs in a single cost- and time-effective step 
and is designed to use on families, not on simplex cases. 

With this optimized RD-chip version, 17 out of 36 Spanish 
pedigrees have been fully diagnosed. Twelve new and 12 
previously reported pathological variants have been identified in 
9 RD genes, adding to the high genetic diversity in retinal 
disorders. The fact that as much as half of the mutations identified 
are new underscores the efficiency of our RD chip compared to 
direct mutational screening., particularly in non-homogenous 
genetic populations. Concerning the major candidates in our 
cohort, CRB1 and USH2A explain 10/19 families overall, in 
accordance with other reports [38,39] and consistent with their 
contribution to several clinical entities. CRB1 is responsible for RP 
and LCA (a more severe form of RP), and more than 150 
mutations have been described so far (http://www.hgmd.org). In 
our panel, the LCA pedigree (10RE) carried a novel splicing 
mutation, whereas the RP pedigrees (120RG, 23NCE, 25NCE, 
T5) were all associated to missense variants (Table 1). Our results 
agree with the Cftfi7-assigned phenotype-genotype correlations, 
where null alleles are mainly associated to the LCA phenotype 
[39,40]. Also in agreement with previous reports, p.C948Y is the 
most prevalent CRB1 mutation in our family panel (3 of the 5 
CRB1 families) [39]. 



Five non-syndromic RP families presented 4 new (2 missense, 
one nonsense and one splicing mutation) and 7 reported (5 
missense, one nonsense and one splicing mutation) pathogenic 
alleles in USH2A. In the Usher cohorts, analysis of USH2A 
rendered 2 new missense and 3 known (one frameshift and 2 
missense) mutations. In two families, only one of the pathogenic 
alleles was identified. In fact, some reports support that as much as 
35% of the second USH2A mutant alleles are duplications, 
deletions and deep intronic variants, which are extremely difficult 
to detect by DNA sequencing [27]. On the other hand, no clear 
phenotype-genotype correlation could be established between RP 
and Usher cases. The most prevalent USH2A mutation in non- 
syndromic RP, p.C759F [32,41], was also present in three families 
of our cohort. Notably, double heterozygosis in USH2A was 
unexpectedly found in two consanguineous families. In fact, in 
pedigree E5, three pathogenic alleles were identified in two 
generations. In this particular case, the assumption of homozy- 
gosity by descent led us to wrongly assume non-cosegregation with 
all RD genes and undertake WES analysis, which eventually 
identified an unexpected double heterozygous genotype. A 
seemingly higher number of mutation carriers had also been 
reported for other syndromic RPs, such as Bardet-Biedl, with no 
solid evidence for this finding [42] . 

The inclusion of X-linked markers in the RD-chip has proved to 
be extremely useful to diagnose families compatible with both 
autosomal and X-linked inheritance patterns. In fact, in pedigree 
1 1NCE, the milder affectation of the female patients was already 
suggestive of a pseudo-dominant effect, as it was indeed confirmed 
(RP2 was the causative gene). On the other hand, family 20NCE, 
with an unassigned mendelian pattern, cosegregated with X-linked 
markers, which prompted to focus on the X-linked candidates. A 
deletion comprising the RP2 locus was identified (Table 1). 

Remarkably, the clinical heterogeneity of retinal disorders was 
highlighted by the identification of RD3-up to now associated only 
to LCA— as causative of RP (Table 1), increasing the phenotypes 
associated to the gene mutations. This case would have remained 
unassigned by conventional methods had it been not for the 
comprehensive analysis of our RD chip, whose main asset is the 
robustness of genetic information to highlight the most probable 
candidates, avoiding the yet burdensome task of big data analysis. 
Particularly, identification of shared (which indicate a common 
founder effect) or extended haplotypes over closely mapping genes 
(which strengthen cosegregation) are invaluable clues to directly 
pinpoint the causative mutation, unveil unexpected candidates, 
and/ or prompt re-evaluation of clinical features. 

Recently, a variety of NGS-based procedures have been 
developed for molecular diagnosis of RDs, from targeted long- 
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range PCR coupled to NGS [43,44], targeted capture and 
sequencing of one or several RD genes [45-47], to full WES 
analysis [1 1,48,49]. Although powerful tools, they still yield limited 
complete diagnostic success: from 37%— 52% in non-related 
cohorts to 80% in cohorts with high consanguinity levels [48] 
(Table 5). Aside high costs, the restraints of NGS-targeted 
approaches are due to the high genetic heterogeneity of retinal 
disorders, whereas the main drawbacks of WES are the high 
sequence coverage requirement and the functional interpretation 
of the identified variants (WES). Within this context, the RD-chip 
efficiency ranges from 16% in dominant to 80% in consanguin- 
eous recessive pedigrees, with an average of 47%, well within the 
upper range of the NGS approaches. Besides, the design of this 
chip is extremely flexible, which allows to incorporate new SNPs to 
expand the gene repertoire after new discoveries or upon demand. 
Before NGS paves the future of personalized diagnosis, our cost- 
and time-effective strategy allows a quick and reliable prioritiza- 
tion of candidates, which is suitable and affordable for middle-size 
diagnostic labs with moderate to high number of family cases. 
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